Fast and Data-Efficient Image Segmentation
The abundance and affordability of cameras have enabled scalable, low-cost collection
of image data. This has opened up many research opportunities related to image
segmentation, both in robot-assisted surgery and in the general computer vision domain.
In this thesis, we focus on the image segmentation problem, as it is a fundamental task
with many applications, including pose estimation of surgical tools in robotic surgery
and eye tracking in head-mounted displays. As a result of our work, we present a
data-efficient method that does not require human annotation of data and performs
inference in real time.
First, we introduce the use of residual neural networks for surgical instrument
segmentation in robotic surgery and show state-of-the-art results on multiple instrument
segmentation datasets. Second, we introduce a neural architecture search
method that finds a highly efficient image segmentation model capable of real-time
inference, a crucial requirement for image segmentation methods in robotic surgery.
Third, to reduce the amount of annotation our method requires, we introduce a
semi-supervised approach that leverages unlabeled images and synthetic training data.
Finally, we introduce the use of generative adversarial networks for the unsupervised
discovery of segmentation classes from unlabeled image data, and we show for the first
time that this task is possible without any annotated data. This matters because data
annotation for image segmentation is a very time-consuming procedure: every pixel of an
image must be assigned to one of the classes. We also study the ability of recently
introduced multimodal approaches such as CLIP to assign text labels to our discovered
segmentation regions. In the end, we present a model that not only discovers
segmentation regions automatically but also assigns text labels to them.
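To make the last contribution concrete, here is a minimal sketch of how a multimodal model such as CLIP can assign a text label to a discovered segmentation region: crop the region, embed it alongside a set of candidate text prompts, and pick the closest prompt. The checkpoint name, candidate labels, and bounding-box cropping are illustrative assumptions, not the exact pipeline used in the thesis.

```python
# Hypothetical sketch: label a discovered segmentation region with CLIP.
# The label list, checkpoint, and cropping strategy are illustrative assumptions.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

candidate_labels = ["a surgical instrument", "tissue", "an eye", "background"]
prompts = [f"a photo of {label}" for label in candidate_labels]

def label_region(image, box):
    """Crop the discovered region and return the best-matching text label."""
    region = image.crop(box)  # box = (left, upper, right, lower) in pixels
    inputs = processor(text=prompts, images=region, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    probs = outputs.logits_per_image.softmax(dim=-1)  # similarity over prompts
    return candidate_labels[int(probs.argmax())]

# Example: label one region proposed by the unsupervised discovery step.
image = Image.open("frame.png").convert("RGB")
print(label_region(image, (120, 80, 360, 300)))
```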
Comparative evaluation of instrument segmentation and tracking methods in minimally invasive surgery
Intraoperative segmentation and tracking of minimally invasive instruments is
a prerequisite for computer- and robotic-assisted surgery. Since additional
hardware such as tracking systems or robot encoders is cumbersome and lacks
accuracy, surgical vision is emerging as a promising approach to segmenting and
tracking the instruments using only the endoscopic images. What has been
missing so far, however, are common image data sets for consistent evaluation and
benchmarking of algorithms against each other. The paper presents a comparative
validation study of different vision-based methods for instrument segmentation
and tracking in the context of robotic as well as conventional laparoscopic
surgery. The contribution of the paper is twofold: we introduce a comprehensive
validation data set that was provided to the study participants and present the
results of the comparative validation study. Based on the results of the
validation study, we arrive at the conclusion that modern deep learning
approaches outperform other methods in instrument segmentation tasks, but the
results are still not perfect. Furthermore, we show that merging the results from
different methods significantly increases accuracy in comparison to
the best stand-alone method. The results of the instrument tracking task, on the
other hand, show that tracking is still an open challenge, especially in difficult
scenarios in conventional laparoscopic surgery.